Dimension Reduction in Regression Estimation with Nearest Neighbor
Authors
Abstract
In regression with a high-dimensional predictor vector, dimension reduction methods aim at replacing the predictor by a lower-dimensional version without loss of information on the regression. In this context, the so-called central mean subspace is the key to dimension reduction. The last two decades have seen the emergence of many methods to estimate the central mean subspace. In this paper, we go one step further and study the performance of a k-nearest neighbor type estimate of the regression function, based on an estimator of the central mean subspace. The estimate is first proved to be consistent. Improvement due to the dimension reduction step is then observed in terms of its rate of convergence. All the results are distribution-free. As an application, we give an explicit rate of convergence using the SIR method.

Index Terms — Dimension Reduction; Central Mean Subspace; Nearest Neighbor Method; Semiparametric Regression; SIR Method.

AMS 2000 Classification — 62H12; 62G08.
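The two-stage idea in the abstract — first estimate the central mean subspace, then run k-nearest neighbor regression on the projected predictor — can be sketched as below. This is a minimal illustrative implementation, not the authors' code: the SIR routine, the function names, and all parameter choices (number of slices, k) are assumptions for the sketch.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, d=1):
    """Estimate d directions of the central mean subspace via
    Sliced Inverse Regression (SIR): whiten X, average the whitened
    predictors within slices of y, and take the top eigenvectors of
    the between-slice covariance, mapped back to the original scale."""
    n, p = X.shape
    mu, cov = X.mean(axis=0), np.cov(X, rowvar=False)
    # cov^{-1/2} via the eigendecomposition of the sample covariance
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mu) @ inv_sqrt
    # Slice the sample by the order of y and average Z within slices
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)
    M = sum(len(s) / n * np.outer(Z[s].mean(axis=0), Z[s].mean(axis=0))
            for s in slices)
    # Leading eigenvectors of M, back-transformed to the X scale
    w = np.linalg.eigh(M)[1][:, ::-1][:, :d]
    return inv_sqrt @ w          # p x d matrix spanning the estimate

def knn_regress(X_train, y_train, X_query, B, k=5):
    """k-NN regression on the reduced predictor t(x) = B^T x."""
    T_train, T_query = X_train @ B, X_query @ B
    dists = np.linalg.norm(T_query[:, None, :] - T_train[None, :, :], axis=2)
    idx = np.argsort(dists, axis=1)[:, :k]
    return y_train[idx].mean(axis=1)
```

On a single-index model such as y = (x'β)³ + ε, SIR recovers the direction β (up to sign), and the neighbor search then takes place in the reduced one-dimensional coordinate rather than in the full predictor space, which is the source of the improved rate discussed in the abstract.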
Similar resources
Discriminant Adaptive Nearest Neighbor Classification and Regression
Robert Tibshirani, Department of Statistics, University of Toronto, tibs@utstat.toronto.edu. Nearest neighbor classification expects the class conditional probabilities to be locally constant, and suffers from bias in high dimensions. We propose a locally adaptive form of nearest neighbor classification to try to finesse this curse of dimensionality. We use a local linear discriminant analysis to e...
Software Cost Estimation by a New Hybrid Model of Particle Swarm Optimization and K-Nearest Neighbor Algorithms
A successful software project should be finalized within a determined, predetermined cost and time. Software is a product whose main cost lies in the expert workforce and professionals; the most important part of software cost estimation (SCE) is therefore related to the trained workforce. The creative and abstract nature of software projects makes estimating their cost and time extremely difficult ...
Boosting the distance estimation: Application to the K-Nearest Neighbor Classifier
In this work we introduce a new distance estimation technique by boosting, and we apply it to the K-Nearest Neighbor Classifier (KNN). Instead of applying AdaBoost to a typical classification problem, we use it for learning a distance function, and the resulting distance is used in K-NN. The proposed method (Boosted Distance with Nearest Neighbor) outperforms the AdaBoost classifier when the tr...
Modelling Climatic Parameters Affecting the Annual Yield of Rheum Ribes Rangeland Species using Data Mining Algorithms
Identification of climatic characteristics affecting the annual yield of Rheum Ribes can be useful in the management and development of this species in rangelands. In this research, the annual yield of this species in Khorasan-Razavi province was evaluated based on 74 climatic parameters during a ten-year period, and the influential climatic parameters were extracted using data mining methods. First, the role ...
Asymptotic Behaviors of Nearest Neighbor Kernel Density Estimator in Left-truncated Data
Kernel density estimators are the basic tools for density estimation in non-parametric statistics. The k-nearest neighbor kernel estimators represent a special form of kernel density estimators, in which the bandwidth is varied depending on the location of the sample points. In this paper, we initially introduce the k-nearest neighbor kernel density estimator in the random left-truncatio...
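The variable-bandwidth idea described in this snippet — taking the bandwidth at a point x to be the distance to its k-th nearest sample point — can be illustrated on plain (untruncated) univariate data. The sketch below is an assumption-laden illustration of the standard k-NN kernel density estimator with a Gaussian kernel; it does not reproduce the left-truncation setting of the paper.

```python
import numpy as np

def knn_kernel_density(x_query, sample, k=20):
    """k-NN kernel density estimate for univariate data:
    f_hat(x) = (1 / (n * R_k(x))) * sum_i K((x - X_i) / R_k(x)),
    where R_k(x) is the distance from x to its k-th nearest sample
    point and K is the standard Gaussian kernel. The bandwidth thus
    shrinks where the sample is dense and widens where it is sparse."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    out = np.empty(np.size(x_query))
    for j, x in enumerate(np.atleast_1d(x_query)):
        d = np.abs(sample - x)
        R_k = np.sort(d)[k - 1]            # k-th nearest neighbor distance
        u = (x - sample) / R_k
        out[j] = np.exp(-0.5 * u ** 2).sum() / (n * R_k * np.sqrt(2 * np.pi))
    return out
```

Compared with a fixed-bandwidth kernel estimator, the local choice R_k(x) automatically adapts to the location of the sample points, which is the special feature of the k-nearest neighbor kernel estimators mentioned above.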
Publication date: 2009